AITopics

Country:

Europe (1.00)
Asia > Middle East (0.28)

Genre: Research Report (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing SystemsFeb-7-2026, 22:44:20 GMT

HowCanIExplainThistoYou?AnEmpiricalStudy ofDeepNeuralNetworkExplanationMethods

Although many of these toolkits are available for use, it is unclear which style of explanation is preferred by end-users, thereby demanding investigation.

artificial intelligence, explanation, machine learning, (19 more...)

Country:

North America > United States > California (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > British Indian Ocean Territory > Diego Garcia (0.04)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

MIT Technology ReviewFeb-5-2026, 13:10:00 GMT

The Download: attempting to track AI, and the next generation of nuclear power

Plus: Anthropic's new tools are freaking out the markets Every time OpenAI, Google, or Anthropic drops a new frontier large language model, the AI community holds its breath. It doesn't exhale until METR, an AI research nonprofit whose name stands for "Model Evaluation & Threat Research," updates a now-iconic graph that has played a major role in the AI discourse since it was first released in March of last year. The graph suggests that certain AI capabilities are developing at an exponential rate, and more recent model releases have outperformed that already impressive trend. That was certainly the case for Claude Opus 4.5, the latest version of Anthropic's most powerful model, which was released in late November. In December, METR announced that Opus 4.5 appeared to be capable of independently completing a task that would have taken a human about five hours--a vast improvement over what even the exponential trend would have predicted. But the truth is more complicated than those dramatic responses would suggest.

large language model, machine learning, natural language, (20 more...)

MIT Technology Review

Country: North America > United States (0.30)

Industry:

Media (1.00)
Leisure & Entertainment (0.73)
Energy > Power Industry > Utilities > Nuclear (0.54)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

WIREDJan-10-2026, 01:11:25 GMT

OpenAI Is Asking Contractors to Upload Work From Past Jobs to Evaluate the Performance of AI Agents

To prepare AI agents for office work, the company is asking contractors to upload projects from past jobs, leaving it to them to strip out confidential and personally identifiable information. OpenAI is asking third-party contractors to upload real assignments and tasks from their current or previous workplaces so that it can use the data to evaluate the performance of its next-generation AI models, according to records from OpenAI and the training data company Handshake AI obtained by WIRED. The project appears to be part of OpenAI's efforts to establish a human baseline for different tasks that can then be compared with AI models. In September, the company launched a new evaluation process to measure the performance of its AI models against human professionals across a variety of industries. OpenAI says this is a key indicator of its progress towards achieving AGI, or an AI system that outperforms humans at most economically valuable tasks. "We've hired folks across occupations to help collect real-world tasks modeled off those you've done in your full-time jobs, so we can measure how well AI models perform on those tasks," reads one confidential document from OpenAI.

contractor, information, openai, (13 more...)

WIRED

Country:

North America > The Bahamas (0.15)
North America > United States > California (0.05)
Europe > Slovakia (0.05)
(2 more...)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.98)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Neural Information Processing SystemsDec-26-2025, 20:15:52 GMT

GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition

Current dataset collection methods typically scrape large amounts of data from the web. While this technique is extremely scalable, data collected in this way tends to reinforce stereotypical biases, can contain personally identifiable information, and typically originates from Europe and North America. In this work, we rethink the dataset collection paradigm and introduce GeoDE, a geographically diverse dataset with 61,940 images from 40 classes and 6 world regions, and no personally identifiable information, collected by soliciting images from people across the world. We analyse GeoDE to understand differences in images collected in this manner compared to web-scraping. Despite the smaller size of this dataset, we demonstrate its use as both an evaluation and training dataset, allowing us to highlight shortcomings in current models, as well as demonstrate improved performance even when training on this small dataset.

geode, geographically diverse evaluation dataset, name change, (3 more...)

Country:

North America (0.28)
Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Neural Information Processing SystemsNov-19-2025, 20:44:33 GMT

GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition

The creation of large-scale image datasets has enabled advances in the performance of computer vision models.

artificial intelligence, deep learning, machine learning, (17 more...)

Country:

North America > United States (0.14)
Asia > East Asia (0.05)
Europe > Western Europe (0.05)
(22 more...)

Genre: Research Report (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceNov-11-2025

Exploratory Analysis of Cyberattack Patterns on E-Commerce Platforms Using Statistical Methods

Adeniya, Fatimo Adenike

Cyberattacks on e-commerce platforms have grown in sophistication, threatening consumer trust and operational continuity. This research presents a hybrid analytical framework that integrates statistical modelling and machine learning for detecting and forecasting cyberattack patterns in the e-commerce domain. Using the Verizon Community Data Breach (VCDB) dataset, the study applies Auto ARIMA for temporal forecasting and significance testing, including a Mann-Whitney U test (U = 2579981.5, p = 0.0121), which confirmed that holiday shopping events experienced significantly more severe cyberattacks than non-holiday periods. ANOVA was also used to examine seasonal variation in threat severity, while ensemble machine learning models (XGBoost, LightGBM, and CatBoost) were employed for predictive classification. Results reveal recurrent attack spikes during high-risk periods such as Black Friday and holiday seasons, with breaches involving Personally Identifiable Information (PII) exhibiting elevated threat indicators. Among the models, CatBoost achieved the highest performance (accuracy = 85.29%, F1 score = 0.2254, ROC AUC = 0.8247). The framework uniquely combines seasonal forecasting with interpretable ensemble learning, enabling temporal risk anticipation and breach-type classification. Ethical considerations, including responsible use of sensitive data and bias assessment, were incorporated. Despite class imbalance and reliance on historical data, the study provides insights for proactive cybersecurity resource allocation and outlines directions for future real-time threat detection research.

data mining, machine learning, pattern recognition, (20 more...)

2511.0302

Country:

North America > United States > California (0.27)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.26)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Services > e-Commerce Services (1.00)
Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.34)

Technology:

Information Technology > e-Commerce (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
(5 more...)

Serenari, Jayden, Lee, Stephen

Semantically-Aware LLM Agent to Enhance Privacy in Conversational AI Services

arXiv.org Artificial IntelligenceNov-3-2025

With the increasing use of conversational AI systems, there is growing concern over privacy leaks, especially when users share sensitive personal data in interactions with Large Language Models (LLMs). Conversations shared with these models may contain Personally Identifiable Information (PII), which, if exposed, could lead to security breaches or identity theft. To address this challenge, we present the Local Optimizations for Pseudonymization with Semantic Integrity Directed Entity Detection (LOPSIDED) framework, a semantically-aware privacy agent designed to safeguard sensitive PII data when using remote LLMs. Unlike prior work that often degrade response quality, our approach dynamically replaces sensitive PII entities in user prompts with semantically consistent pseudonyms, preserving the contextual integrity of conversations. Once the model generates its response, the pseudonyms are automatically depseudonymized, ensuring the user receives an accurate, privacy-preserving output. We evaluate our approach using real-world conversations sourced from ShareGPT, which we further augment and annotate to assess whether named entities are contextually relevant to the model's response. Our results show that LOPSIDED reduces semantic utility errors by a factor of 5 compared to baseline techniques, all while enhancing privacy.

large language model, machine learning, natural language, (18 more...)

2510.27016

Country:

North America > United States > Pennsylvania (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Uzor, GodsGift, Al-Qudah, Hasan, Ineza, Ynes, Serwadda, Abdul

Guarding Your Conversations: Privacy Gatekeepers for Secure Interactions with Cloud-Based AI Models

arXiv.org Artificial IntelligenceAug-26-2025

--The interactive nature of Large Language Models (LLMs), which closely track user data and context, has prompted users to share personal and private information in unprecedented ways. Even when users opt out of allowing their data to be used for training, these privacy settings offer limited protection when LLM providers operate in jurisdictions with weak privacy laws, invasive government surveillance, or poor data security practices. In such cases, the risk of sensitive information, including Personally Identifiable Information (PII), being mishandled or exposed remains high. T o address this, we propose the concept of an "LLM gatekeeper", a lightweight, locally run model that filters out sensitive information from user queries before they are sent to the potentially untrustworthy, though highly capable, cloud-based LLM. Through experiments with human subjects, we demonstrate that this dual-model approach introduces minimal overhead while significantly enhancing user privacy, without compromising the quality of LLM responses. Large Language Models (LLMs) like ChatGPT have revolutionized digital interactions by providing personalized, context-aware responses that evolve with the dialogue. Unlike traditional information sources, LLMs' dynamic engagement often leads users to share increasingly personal details over multiple sessions, sometimes unknowingly. This gradual accumulation of sensitive information, compounded by the public's limited understanding of risks like neural network memorization, increases the likelihood of unintentional disclosure. The issue is further exacerbated when proprietary LLMs operate in jurisdictions with weak privacy regulations, limited data security, or invasive governmental surveillance.

large language model, machine learning, natural language, (21 more...)

doi: 10.1109/ICSC64641.2025.00040

2508.16765

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

arXiv.org Artificial IntelligenceAug-12-2025

A DICOM Image De-identification Algorithm in the MIDI-B Challenge

Jiang, Hongzhu, Xie, Sihan, Wan, Zhiyu

Image de-identification is essential for the public sharing of medical images, particularly in the widely used Digital Imaging and Communications in Medicine (DICOM) format as required by various regulations and standards, including Health Insurance Portability and Accountability Act (HIPAA) privacy rules, the DICOM PS3.15 standard, and best practices recommended by the Cancer Imaging Archive (TCIA). The Medical Image De-Identification Benchmark (MIDI-B) Challenge at the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024) was organized to evaluate rule-based DICOM image de-identification algorithms with a large dataset of clinical DICOM images. In this report, we explore the critical challenges of de-identifying DICOM images, emphasize the importance of removing personally identifiable information (PII) to protect patient privacy while ensuring the continued utility of medical data for research, diagnostics, and treatment, and provide a comprehensive overview of the standards and regulations that govern this process. Additionally, we detail the de-identification methods we applied - such as pixel masking, date shifting, date hashing, text recognition, text replacement, and text removal - to process datasets during the test phase in strict compliance with these standards. According to the final leaderboard of the MIDI-B challenge, the latest version of our solution algorithm correctly executed 99.92% of the required actions and ranked 2nd out of 10 teams that completed the challenge (from a total of 22 registered teams). Finally, we conducted a thorough analysis of the resulting statistics and discussed the limitations of current approaches and potential avenues for future improvement.

artificial intelligence, information, machine learning, (14 more...)

2508.07538

Genre:

Research Report (0.83)
Overview (0.54)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.90)
Information Technology > Artificial Intelligence > Machine Learning (0.67)